Behzad Razavi

# The StrongARM Latch

The StrongARM latch topology finds wide usage as a sense amplifier, a comparator, or simply a robust latch with high sensitivity. The term "StrongARM" commemorates the use of this circuit in Digital Equipment Corporation's StrongARM microprocessor [1], but the basic structure was originally introduced by Toshiba's Kobayashi et al. [2]. The StrongARM latch has become popular for three reasons: 1) it consumes zero static power, 2) it directly produces rail-to-rail outputs, and 3) its input-referred offset arises primarily from one differential pair. In this column, we study the circuit and its properties.

## **Basic Operation**

Figure 1(a) shows the original StrongARM latch, reported in [2] without $M_8$ and in [1] with $M_8$. The circuit was later modified as depicted in Figure 1(b) [3]. We first study the latter and then point out the differences among these versions.

Digital Object Identifier 10.1109/MSSC.2015.2418155. Date of publication: 25 June 2015.

The latch of Figure 1(b) consists of a clocked differential pair, $M_1$–$M_2$, two cross-coupled pairs, $M_3$–$M_4$ and $M_5$–$M_6$, and four precharge switches, $S_1$–$S_4$. The circuit provides rail-to-rail outputs at X and Y in response to the polarity of $V_{\rm in1} - V_{\rm in2}$. We describe the operation in four phases.

In the first phase, CK is low;  $M_1$  and  $M_2$  are off; nodes P, Q, X, and Y are precharged to  $V_{\rm DD}$ ; and the circuit reduces to that shown in Figure 2(a).

In the second phase, CK goes high, $S_1$–$S_4$ turn off, and $M_1$ and $M_2$ turn on, drawing a differential current in proportion to $V_{\rm in1} - V_{\rm in2}$. With $M_3$–$M_6$ initially off, this current flows from $C_P$ and $C_Q$ [Figure 2(b)], thereby allowing $|V_P - V_Q|$ to grow and possibly exceed $|V_{\rm in1} - V_{\rm in2}|$. That is, this phase can provide voltage gain. We call this phase the amplification mode. Since the tail current is fairly constant during this period, we can write $|V_P - V_Q| \approx (g_{m1,2}|V_{\rm in1} - V_{\rm in2}|/C_{P,Q}) t$, where $g_{m1,2}$ denotes the small-signal transconductance of $M_1$ and $M_2$, and $C_{P,Q} = C_P = C_Q$.

As $V_P$ and $V_Q$ fall to $V_{\rm DD} - V_{\rm THN}$, the cross-coupled NMOS transistors turn on (third phase), allowing part of the drain currents of $M_1$ and $M_2$ to flow from X and Y [Figure 2(c)]. The amplification mode therefore lasts for approximately $(C_{P,Q}/I_{\rm CM})\,V_{\rm THN}$ seconds, where $I_{\rm CM}$ is the common-mode (CM) current drawn from each capacitance. The voltage gain in this mode is roughly given by [4]

$$A_v \approx \frac{g_{m1,2} V_{\text{THN}}}{I_{\text{CM}}}. \tag{1}$$
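As a quick numerical illustration of the amplification mode, the sketch below evaluates its duration, $(C_{P,Q}/I_{\rm CM})V_{\rm THN}$, and the gain in (1). The function name and all component values are assumptions chosen only to give representative numbers; they are not taken from a particular design.

```python
def amplification_phase(gm12, c_pq, i_cm, v_thn):
    """Return (duration in seconds, voltage gain) of the amplification mode."""
    t_amp = c_pq * v_thn / i_cm   # time for P and Q to fall by V_THN
    gain = gm12 * v_thn / i_cm    # equation (1)
    return t_amp, gain


# Assumed values: gm = 1 mS, C_PQ = 10 fF, I_CM = 100 uA per side, V_THN = 0.3 V
t_amp, a_v = amplification_phase(gm12=1e-3, c_pq=10e-15, i_cm=100e-6, v_thn=0.3)
print(f"t_amp = {t_amp * 1e12:.1f} ps, A_v = {a_v:.2f}")
```

The gain equals $g_{m1,2}\,t/C_{P,Q}$ evaluated at the end of the phase, so for a given overdrive it cannot be raised by changing $C_{P,Q}$ alone; a larger capacitance only lengthens the phase.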

FIGURE 1: (a) The original and (b) modified StrongARM latch topologies.

FIGURE 2: Latch operation phases: (a) precharge, (b) amplification, (c) turn-on of cross-coupled NMOS pair, (d) equivalent circuit of (c), and (e) turn-on of cross-coupled PMOS pair.

The behavior of the latch in the third phase can be analyzed with the aid of the equivalent circuit shown in Figure 2(d), where $+\Delta I$ and $-\Delta I$ represent the differential current produced by $M_1$ and $M_2$. Summing currents at the four nodes yields

$$-C_X \frac{dV_X}{dt} = g_{m3}(V_Y - V_P) \tag{2}$$

$$-C_Y \frac{dV_Y}{dt} = g_{m4}(V_X - V_Q) \tag{3}$$

$$-C_P \frac{dV_P}{dt} = C_X \frac{dV_X}{dt} + \Delta I \tag{4}$$

$$-C_Q \frac{dV_Q}{dt} = C_Y \frac{dV_Y}{dt} - \Delta I. \tag{5}$$

We subtract the second equation from the first, obtaining

$$-C_{X,Y}\frac{d(V_X - V_Y)}{dt} = g_{m3,4}(-V_X + V_Y - V_P + V_Q). \tag{6}$$

Integrating both sides of (4) and (5) and combining the results, we have

$$C_{P,Q}(V_Q-V_P)=C_{X,Y}(V_X-V_Y)+2\Delta I t, \tag{7}$$

which, upon substitution in (6), gives

$$C_{X,Y} \frac{d(V_X - V_Y)}{dt} - g_{m3,4} \left( 1 - \frac{C_{X,Y}}{C_{P,Q}} \right) (V_X - V_Y) = -2g_{m3,4} \frac{\Delta I}{C_{P,Q}} t. \tag{8}$$

If $g_{m3,4}$ is assumed relatively constant, this equation reveals a natural response of the form $\exp(t/\tau_{\text{reg}})$, where $\tau_{\text{reg}}$ is the regeneration time constant, expressed as

$$\tau_{\text{reg}} = \frac{C_{X,Y}}{g_{m3,4}(1 - C_{X,Y}/C_{P,Q})}. \tag{9}$$
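As a worked step that is not part of the original derivation, (8) can be solved in closed form under the same assumption of a constant $g_{m3,4}$ and the additional assumption that $V_X - V_Y = 0$ at the start of the third phase. The shorthand $k$ is introduced here only for convenience:

$$k = \frac{2 g_{m3,4}\,\Delta I}{C_{X,Y} C_{P,Q}}, \qquad V_X - V_Y = k\,\tau_{\text{reg}} \left[ t + \tau_{\text{reg}} \left( 1 - e^{t/\tau_{\text{reg}}} \right) \right].$$

Substituting this expression into (8) confirms that it satisfies the equation. For $t \ll |\tau_{\text{reg}}|$, it reduces to $V_X - V_Y \approx -k t^2/2$, i.e., the output difference initially grows quadratically with time before the exponential term takes over.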



FIGURE 3: (a) Latch without a cross-coupled NMOS pair and (b) the resulting static current.



FIGURE 4: The StrongARM latch followed by the RS latch.

Interestingly, the degeneration caused by $C_P$ and $C_Q$ raises $\tau_{\text{reg}}$ by a factor of $1/(1 - C_{X,Y}/C_{P,Q})$. Since, in practice, $C_{X,Y}$ includes the input capacitance of the stage following the comparator and is hence greater than $C_{P,Q}$, the cross-coupled NMOS transistors provide little regeneration in this phase.
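The third-phase equations can also be checked numerically. The sketch below evaluates (9) and integrates (2)–(5) with a simple forward-Euler loop, then verifies that the result satisfies (7). It models only the differential behavior: all node voltages are treated as deviations from their values at the start of the phase, $g_{m3,4}$ is held constant, and the function names and component values are assumptions made for illustration.

```python
import math


def tau_reg(c_xy, c_pq, gm34):
    """Regeneration time constant per (9).

    A negative result means the natural response does not grow.
    """
    return c_xy / (gm34 * (1.0 - c_xy / c_pq))


def integrate_third_phase(c_xy, c_pq, gm34, delta_i, t_end, steps=20000):
    """Forward-Euler integration of (2)-(5), all voltages starting at zero."""
    dt = t_end / steps
    vx = vy = vp = vq = 0.0
    for _ in range(steps):
        dvx = -gm34 * (vy - vp) / c_xy          # (2)
        dvy = -gm34 * (vx - vq) / c_xy          # (3)
        dvp = -(c_xy * dvx + delta_i) / c_pq    # (4)
        dvq = -(c_xy * dvy - delta_i) / c_pq    # (5)
        vx += dvx * dt
        vy += dvy * dt
        vp += dvp * dt
        vq += dvq * dt
    return vx, vy, vp, vq


# Assumed values: C_XY = 10 fF, C_PQ = 20 fF, gm = 1 mS, delta_I = 10 uA
C_XY, C_PQ, GM, DI, T_END = 10e-15, 20e-15, 1e-3, 10e-6, 50e-12
vx, vy, vp, vq = integrate_third_phase(C_XY, C_PQ, GM, DI, T_END)
lhs = C_PQ * (vq - vp)                          # left side of (7)
rhs = C_XY * (vx - vy) + 2.0 * DI * T_END       # right side of (7)
print(f"tau_reg = {tau_reg(C_XY, C_PQ, GM) * 1e12:.1f} ps")
print(f"(7): lhs = {lhs:.3e}, rhs = {rhs:.3e}")
```

Calling `tau_reg` with $C_{X,Y} > C_{P,Q}$ returns a negative value, which is the numerical counterpart of the observation that the NMOS pair provides little regeneration in that case.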

The output voltages $V_X$ and $V_Y$ continue to fall until they reach $V_{\rm DD} - |V_{\rm THP}|$, at which point $M_5$ and $M_6$ turn on [Figure 2(e)] and the circuit enters the fourth phase. The positive feedback around these transistors eventually brings one output back to $V_{\rm DD}$ while allowing the other to fall to zero.



FIGURE 5: Offset cancellation by programmable capacitors.

It is important to appreciate the role of each transistor in the StrongARM latch of Figure 1(b). Besides  $M_1$ – $M_2$  and  $M_7$ , the remaining devices also serve critical purposes.

- Transistors $M_3$–$M_4$ cut off the dc path between $V_{\rm DD}$ and ground at the end of the fourth phase, avoiding static power drain. To understand this point, let us omit $M_3$ and $M_4$ as shown in Figure 3(a) and assume a differential input, $V_{\rm in1} - V_{\rm in2}$, of about 100 mV around a common-mode (CM) level near $V_{\rm DD}/2$. When the latch is clocked, $V_X$ falls, $V_Y$ rises, and $M_5$ turns off. Consequently, the circuit reduces to that in Figure 3(b), drawing a static current from $V_{\rm DD}$. (This does not occur for rail-to-rail inputs.)
- Transistors $M_5$ and $M_6$ principally restore the output high level to $V_{\rm DD}$; without them, the CM discharge at X or Y would yield a degraded high level (if $|V_{\rm in1} - V_{\rm in2}|$ is small).
- Switches $S_1$ and $S_2$ play two roles: a) they remove the previous states at nodes P and Q, suppressing dynamic offsets, and b) they establish an initial voltage of $V_{\rm DD}$ at these nodes, allowing amplification before $M_1$ and $M_2$ enter the triode region. Both of these points distinguish the topology of Figure 1(b) from that in Figure 1(a). The original StrongARM latch fails to equalize $V_P$ and $V_Q$ accurately because $M_8$ turns off near the end of the precharge mode. Without $M_8$, the dynamic offset would prove even more serious. Moreover, the circuit has little voltage gain in the amplification mode because $V_P$ and $V_Q$ begin at $V_{\rm DD}-V_{\rm THN}$. Since in this case $M_3$ and $M_4$ turn on before significant gain accrues, they contribute a greater offset.

- Switches $S_3$ and $S_4$ precharge X and Y to $V_{\rm DD}$, ensuring that $M_5$ and $M_6$ remain off during the initial amplification and negligibly raise the offset.

The StrongARM latch generates invalid outputs ( $V_X = V_Y = V_{\rm DD}$ ) for about half of the clock cycle. For the subsequent logic to interpret the outputs correctly, an RS latch must follow the circuit. Figure 4 shows a typical arrangement where inverters serve as buffers between the two latches and allow the RS latch to toggle only if  $V_X$  or  $V_Y$  falls.

The power consumed by the StrongARM latch of Figure 1(b) arises primarily from the charge and discharge of the capacitances. It is therefore roughly equal to $f_{\rm CK}(2C_{P,Q} + C_{X,Y})V_{\rm DD}^2$, where $f_{\rm CK}$ is the clock frequency and the factor of 2 accounts for the discharge of both P and Q to near ground in every cycle.
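For a sense of scale, the power estimate above can be evaluated directly. The following is a minimal sketch; the function name and the numbers (1 GHz clock, 10 fF and 20 fF capacitances, 1 V supply) are assumed for illustration only.

```python
def latch_dynamic_power(f_ck, c_pq, c_xy, v_dd):
    """Approximate power: f_CK * (2*C_PQ + C_XY) * V_DD^2, in watts."""
    return f_ck * (2.0 * c_pq + c_xy) * v_dd ** 2


# Assumed values: f_CK = 1 GHz, C_PQ = 10 fF, C_XY = 20 fF, V_DD = 1 V
p = latch_dynamic_power(f_ck=1e9, c_pq=10e-15, c_xy=20e-15, v_dd=1.0)
print(f"P = {p * 1e6:.1f} uW")
```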

## **Offset**

If operating as a sense amplifier or a comparator, the StrongARM latch must achieve a sufficiently small input-referred offset voltage. As explained in the previous section, the precharge action of $S_1$–$S_4$ in Figure 1(b) keeps $M_3$–$M_6$ off initially, thereby reducing their offset contribution. In a typical design, the mismatches between $M_3$ and $M_4$ are divided by about a factor of $A_v \approx 4$ when referred to the input, and those between $M_5$ and $M_6$ by about a factor of ten (because these transistors turn on only near the end). Thus, $M_1$ and $M_2$ become the dominant contributors.

Since the amplification mode provides voltage gain by the flow of charge from $C_P$ and $C_Q$, one can create asymmetry by making $C_P \neq C_Q$ and hence cancel the circuit's offset. Illustrated in Figure 5 [4], [5], the idea is to establish different discharge rates at P and Q. Writing the drain current of each transistor as the sum of a component proportional to $V_{\rm in1}-V_{\rm in2}$ and a CM component, $I_{\rm CM}$, we have

$$V_P = V_{\rm DD} - \frac{g_{m1}(V_{\rm in1} - V_{\rm in2})}{2C_P}t - \frac{I_{\rm CM}}{C_P}t \tag{10}$$

$$V_Q = V_{\rm DD} + \frac{g_{m2}(V_{\rm in1} - V_{\rm in2})}{2C_Q}t - \frac{I_{\rm CM}}{C_Q}t. \tag{11}$$

It follows that

$$V_P - V_Q = -\frac{g_{m1,2}}{2} \cdot \frac{C_P + C_Q}{C_P C_Q} (V_{\rm in1} - V_{\rm in2})\, t + \frac{C_P - C_Q}{C_P C_Q} I_{\rm CM}\, t. \tag{12}$$

We observe that during amplification  $V_P - V_Q$  accumulates an offset equal to  $(C_P - C_Q)/(C_P C_Q)I_{CM}t$ , which can cancel the latch's random offset. The amplification mode ends roughly when  $V_P$  and  $V_Q$  fall below  $V_{\rm DD} - V_{\rm THN}$ , and its duration is given by  $t \approx V_{\rm THN}(C_P + C_Q)/(2I_{\rm CM})$ , where  $(C_P + C_Q)/2$  is used as an approximation. The built-in offset is therefore equal to  $V_{\rm THN}(C_P/C_Q - C_Q/C_P)/2$ .
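The built-in offset can be cross-checked numerically. The sketch below computes it twice: once from the closed-form expression and once from the second term of (12) evaluated at the end of the amplification mode. Both results are differences in $V_P - V_Q$, as in (12). The function names and the 5% capacitor imbalance are assumptions for illustration.

```python
import math


def built_in_offset(c_p, c_q, v_thn):
    """Closed form from the text: V_THN * (C_P/C_Q - C_Q/C_P) / 2."""
    return 0.5 * v_thn * (c_p / c_q - c_q / c_p)


def built_in_offset_from_eq12(c_p, c_q, v_thn, i_cm):
    """Second term of (12) at t = V_THN * (C_P + C_Q) / (2 * I_CM)."""
    t_amp = v_thn * (c_p + c_q) / (2.0 * i_cm)
    return (c_p - c_q) / (c_p * c_q) * i_cm * t_amp


# Assumed values: C_P = 10.5 fF, C_Q = 10 fF, V_THN = 0.3 V, I_CM = 100 uA
v_closed = built_in_offset(10.5e-15, 10e-15, 0.3)
v_eq12 = built_in_offset_from_eq12(10.5e-15, 10e-15, 0.3, 100e-6)
print(f"closed form: {v_closed * 1e3:.2f} mV, from (12): {v_eq12 * 1e3:.2f} mV")
```

The two values agree for any $I_{\rm CM}$, since the CM current cancels between the discharge rate and the duration of the phase.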

To perform offset cancellation, the main inputs are shorted together, the circuit is clocked, and the output decision drives a register that controls the values of  $C_P$  and  $C_Q$  [5], [4]. Of course, to reduce the offset from a high value (e.g., 30 mV) to a low value (e.g., 1 mV), a large number of small unit capacitors must be attached to P and Q, degrading the speed and raising the power dissipation. Another offset cancellation method for the StrongARM latch is described in [6].
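The calibration procedure can be pictured with a purely behavioral model. The sketch below is hypothetical: it assumes a noise-free comparator, a register that moves by one unit capacitor per clock cycle in the direction indicated by the output decision, and a constant offset shift per unit capacitor (`step_v`). None of these details come from [4] or [5].

```python
def calibrate_offset(offset_v, step_v, max_code, cycles):
    """Hypothetical up/down calibration loop (behavioral, noise-free).

    offset_v : input-referred offset to cancel, in volts
    step_v   : assumed offset shift per unit capacitor, in volts
    max_code : unit capacitors available on each side
    Returns the final signed code; the sign selects the side (P or Q)
    that receives the extra capacitance.
    """
    code = 0
    for _ in range(cycles):
        residual = offset_v - code * step_v      # main inputs shorted together
        decision = residual > 0.0                # comparator output this cycle
        code += 1 if decision else -1            # register steps toward balance
        code = max(-max_code, min(max_code, code))
    return code


# Assumed values: 30 mV initial offset, 1 mV per unit capacitor, 40 units per side
code = calibrate_offset(offset_v=30e-3, step_v=1e-3, max_code=40, cycles=100)
residual = 30e-3 - code * 1e-3
print(f"final code = {code}, residual offset = {residual * 1e3:.2f} mV")
```

In this model the residual offset cannot be made smaller than about one step, which is why reducing 30 mV to 1 mV calls for roughly 30 unit capacitors on a side.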

## **Electronic Noise**

FIGURE 6: The behavior of a noisy comparator with (a) zero and (b) finite input differences.

From the foregoing offset studies, we can predict that the precharge action of $S_1$–$S_4$ in Figure 1(b) also reduces the electronic noise contributed by $M_3$–$M_6$. Most of the input-referred noise originates from $M_1$ and $M_2$ and the kT/C noise deposited by $S_1$ and $S_2$ because the other transistors come into play only after significant gain has accrued. In the amplification mode, the equivalent circuit of Figure 2(b) behaves as an integrator, generating output noise from the noise of $M_1$ and $M_2$. The variance of this voltage (a quantity akin to the mean square value) grows with time as [4], [7], [8]

$$E(V_{PQ}^2) = \frac{8kT\gamma g_{m1,2}}{C_{P,Q}^2}t. \tag{13}$$

Since the amplification mode lasts about  $(C_{P,Q}/I_{CM}) V_{THN}$  seconds, we compute the final output noise variance due to  $M_1$  and  $M_2$  in this mode as

$$\sigma_{1,2}^2 = \frac{8kT\gamma}{C_{P,Q}} \cdot \frac{g_{m1,2} V_{\text{THN}}}{I_{\text{CM}}}. \tag{14}$$

Adding the kT/C noise contributed by  $S_1$  and  $S_2$ , dividing the result by the square of the voltage gain, and writing  $g_{m1,2} \approx 2I_{\rm CM}/(V_{\rm GS}-V_{\rm THN})_{1,2}$ , we obtain the total (integrated) input-referred noise observed in this mode as

$$\overline{V_{n,\text{in}}^{2}} = \frac{(V_{\text{GS}} - V_{\text{THN}})_{1,2}}{V_{\text{THN}}} \cdot \left[ \frac{4kT\gamma}{C_{P,Q}} + \frac{(V_{\text{GS}} - V_{\text{THN}})_{1,2}}{V_{\text{THN}}} \frac{kT}{2C_{P,Q}} \right]. \tag{15}$$

The first term within the square brackets represents the noise due to  $M_1$  and  $M_2$  and is typically four to eight times greater than the second. Other sources of noise are quantified in [4].
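To see the relative size of the two terms in (15), the sketch below evaluates them for one set of assumed values: an overdrive of 0.2 V, $V_{\rm THN} = 0.3$ V, $C_{P,Q} = 10$ fF, $\gamma = 2/3$, and $T = 300$ K. The function name and these numbers are illustrative assumptions, not design data.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def input_referred_noise(v_ov, v_thn, c_pq, gamma, temp_k=300.0):
    """Return the two terms of (15) in V^2: (M1-M2 part, S1-S2 kT/C part)."""
    kt = K_B * temp_k
    ratio = v_ov / v_thn                          # (V_GS - V_THN) / V_THN
    term_m12 = ratio * 4.0 * kt * gamma / c_pq
    term_sw = ratio * ratio * kt / (2.0 * c_pq)
    return term_m12, term_sw


m12, sw = input_referred_noise(v_ov=0.2, v_thn=0.3, c_pq=10e-15, gamma=2.0 / 3.0)
rms_mv = math.sqrt(m12 + sw) * 1e3
print(f"rms input-referred noise = {rms_mv:.2f} mV, term ratio = {m12 / sw:.1f}")
```

From (15), the ratio of the two terms is $8\gamma V_{\rm THN}/(V_{\rm GS} - V_{\rm THN})_{1,2}$, so it depends only on $\gamma$ and the overdrive relative to the threshold, not on $C_{P,Q}$.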

While not specific to the StrongARM latch, the simulation of noise in comparators poses interesting issues. Unlike small-signal analog circuits, a comparator does not directly provide an output noise and a gain by which the noise should be divided. For simpler topologies, one can place the comparator in a metastable condition and perform a small-signal analysis, but the StrongARM latch completes switching actions and noise injections even before the output begins to change. A methodical simulation proceeds as follows. Suppose a comparator with a zero offset and a zero differential input is clocked many times (we assume the simulator includes noise in transient simulations). Then, the Gaussian noise within the circuit allows eventual recovery from metastability, producing ones and zeros at the output with equal probabilities [Figure 6(a)].

In the next step, we apply a small differential input, $V_S$ (a few millivolts), as shown in Figure 6(b) and repeat the simulation. Since $V_S$ skews the comparator decisions, the ones and zeros occur with unequal probabilities; zeros appear only if the input-referred noise is more negative than $-V_S$.



FIGURE 7: Kickback noise paths.



FIGURE 8: (a) An alternative topology for lower kickback noise and (b) behavior in the precharge mode.

For a large number of clock cycles, therefore, we predict that the number of zeros at the output,  $n_0$ , is proportional to the area under the Gaussian probability distribution function,  $f_X(x)$ , from  $-\infty$  to  $-V_S$ ; the number of ones,  $n_1$ , is proportional to the area from  $-V_S$  to  $+\infty$ . Based on the numbers observed in the simulations, we can write

$$\frac{\int_{-\infty}^{-V_S} f_X(x) \, dx}{1 - \int_{-\infty}^{-V_S} f_X(x) \, dx} = \frac{n_0}{n_1} \tag{16}$$

and hence compute the variance of  $f_X(x)$ , which corresponds to the input-referred noise voltage squared. The value of  $V_S$  must be chosen large enough to ensure  $n_0/n_1$  substantially departs from unity but not so large that  $n_0$  or  $n_1$  is excessively small and statistically insignificant.
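The extraction step in (16) can be sketched in a few lines. Assuming zero-mean Gaussian noise, (16) implies that the cumulative distribution evaluated at $-V_S$ equals $n_0/(n_0 + n_1)$, which can be inverted for the standard deviation. In the code below, the comparator is replaced by a hypothetical stand-in that simply draws Gaussian samples; in practice $n_0$ and $n_1$ would come from transient noise simulations.

```python
import random
from statistics import NormalDist


def sigma_from_counts(n0, n1, v_s):
    """Solve (16) for the rms input-referred noise (zero-mean Gaussian assumed).

    Requires 0 < n0 < n1 so that the inverse CDF is finite and nonzero.
    """
    p0 = n0 / (n0 + n1)              # CDF evaluated at -V_S
    z = NormalDist().inv_cdf(p0)     # equals -V_S / sigma
    return -v_s / z


def count_decisions(sigma, v_s, cycles, seed=1):
    """Stand-in for the simulation: a zero occurs when the noise is below -V_S."""
    rng = random.Random(seed)
    n0 = sum(1 for _ in range(cycles) if rng.gauss(0.0, sigma) < -v_s)
    return n0, cycles - n0


# Assumed values: true sigma = 2 mV, V_S = 1 mV, 20000 clock cycles
n0, n1 = count_decisions(sigma=2e-3, v_s=1e-3, cycles=20000)
estimate = sigma_from_counts(n0, n1, v_s=1e-3)
print(f"n0 = {n0}, n1 = {n1}, estimated sigma = {estimate * 1e3:.2f} mV")
```

The estimate becomes ill-conditioned as $n_0/n_1$ approaches unity (the inverse CDF approaches zero) and statistically noisy when $n_0$ is very small, which mirrors the guidance on choosing $V_S$.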

## **Kickback and Supply Transients**

The StrongARM latch draws high transient currents from the inputs and the supply. These transients become troublesome if a large number of comparators operate in parallel, as in a flash analog-to-digital converter.

The "kickback" currents drawn from the inputs stem from several mechanisms (Figure 7), exhibiting both differential and CM components. The former appear mostly as $V_P$ and $V_Q$ fall toward ground at unequal rates and couple to the inputs through $C_{\rm GD1}$ and $C_{\rm GD2}$. This effect becomes more pronounced as $M_1$ and $M_2$ enter the triode region and their gate-drain capacitances increase. The CM kickback noise currents are much greater and occur when $M_7$ turns on, initially drawing its drain current from $C_{\rm GS1}$ and $C_{\rm GS2}$, and when it turns off, with CK coupling through $C_{\rm GD7} (\approx C_{\rm GS7})$ to $C_{\rm GS1}$ and $C_{\rm GS2}$.

It is possible to reduce the kickback noise by clocking the input devices through their drain path rather than their source path. Depicted in Figure 8(a) [9], such a topology incorporates $M_7$ and $M_8$ to control the latch. However, the kickback noise is lowered at the cost of a higher input offset because $M_1$ and $M_2$ now operate in the triode region during the amplification mode. This issue can be avoided by making $M_3$–$M_4$ and $M_7$–$M_8$ wide; but, as illustrated in Figure 8(b), the slow discharge at A or B in the precharge mode leads to significant imbalance between $V_P$ and $V_Q$ and hence a large dynamic offset.

The supply transient currents originate from the precharge action of $S_1$–$S_4$ in Figure 1(b). If CK falls fast, three of $S_1$–$S_4$ momentarily enter the saturation region (the fourth one is in the triode region because its drain voltage is equal to $V_{\rm DD}$) and pull a large current from $V_{\rm DD}$. The key point here is that designs consuming a low average power may still draw high peak currents from the supply, dictating a low supply impedance.

## **Questions for the Reader**

- 1) Do  $V_P$  and  $V_Q$  in Figure 1(b) reach 0 V at the end of the regeneration phase?
- 2) Explain why  $M_3$  and  $M_4$  in Figure 1(b) can be omitted if the inputs have rail-to-rail swings.
- 3) Explain why the coupling through  $C_{\text{GD7}}$  in Figure 7 is less on the rising edge of CK than on the falling edge of CK.

You can share your thoughts with me by sending an e-mail to razavi@ee.ucla.edu.

## **Answers to Last Issue's Questions**

1) Can we use a negative impedance converter (NIC) in a PA predriver to cancel the input capacitance of the output stage?

Since an RF predriver typically uses a resonant load, the NIC would cause oscillation. If injection-locking is desired in this stage, a simple cross-coupled pair suffices.



FIGURE 9: A cross-coupled pair using bulk terminals.

2) How does the thermal noise contributed by  $M_1$  and  $M_2$  in Figure 9 to  $V_{\text{out}}$  compare to that by a regular XCP?

This circuit produces a noise voltage across $R_L$ equal to $\sqrt{2} g_m R_L V_n / (2 - g_{\rm mb} R_L)$, where $V_n$ denotes the gate-referred noise of each transistor and the noise of $R_L$ is neglected. For a regular cross-coupled pair, the output noise is given by $\sqrt{2} g_m R_L V_n / (2 - g_m R_L)$. For a fair comparison, the total resistance seen at the output must be the same for the two topologies, thus yielding the same output noise.

## **References**

- [1] J. Montanaro, R. Witek, K. Anne, and A. Black, "A 160-MHz 32-b 0.5-W CMOS RISC microprocessor," *IEEE J. Solid-State Circuits*, vol. 31, pp. 1703–1714, Nov. 1996.
- [2] T. Kobayashi, K. Nogami, T. Shirotori, and Y. Fujimoto, "A current-mode latch sense amplifier and a static power saving input buffer for low-power architecture," in Proc. VLSI Circuits Symp. Dig. Technical Papers, June 1992, pp. 28–29.
- [3] Y. T. Wang and B. Razavi, "An 8-bit 150-MHz CMOS A/D converter," *IEEE J. Solid-State Circuits*, vol. 35, pp. 308–317, Mar. 2000.

- [4] P. Nuzzo, F. De Bernardinis, P. Terreni, and G. Van der Plas, "Noise analysis of regenerative comparators for reconfigurable ADC architectures," *IEEE Trans. Circuits Syst. I*, vol. 55, pp. 1441–1454, July 2008.
- [5] M. J. E. Lee, W. J. Dally, and P. Chiang, "Low-power area-efficient high-speed I/O circuit techniques," *IEEE J. Solid-State Circuits*, vol. 35, pp. 1591–1599, Nov. 2000.
- [6] M. Yoshioka, K. Ishikawa, T. Takayama, and S. Tsukamoto, "A 10-b 50-MS/s 820-μW SAR ADC with on-chip digital calibration," *IEEE Trans. Biomed. Circuits Syst.*, vol. 4, pp. 411–418, Dec. 2010.
- [7] S. W. Chiang and B. Razavi, "A 10-bit 800-MHz 19-mW CMOS ADC," *IEEE J. Solid-State Circuits*, vol. 49, pp. 935–949, Apr. 2014.
- [8] T. Sepke, P. Holloway, G. Sodini, and H. S. Lee, "Noise analysis of comparator-based circuits," *IEEE Trans. Circuits Syst. I*, vol. 56, pp. 541–553, Mar. 2009.
- [9] R. J. Baker, *CMOS Circuit Design, Layout, and Simulation*. Hoboken, NJ: Wiley, 2010.


